159 research outputs found

Crowdbreaks: Tracking Health Trends using Public Social Media Data and Crowdsourcing

Author: Mueller Martin
Salathé Marcel
Publication date: 14/05/2018

In the past decade, tracking health trends using social media data has shown great promise, due to a powerful combination of massive adoption of social media around the world and increasingly potent hardware and software that enable us to work with these new big data streams. At the same time, many challenging problems have been identified. First, there is often a mismatch between how rapidly online data can change and how rapidly algorithms are updated, which limits the reusability of algorithms trained on past data because their performance decreases over time. Second, much of the work focuses on specific issues during a specific past period, even though public health institutions need flexible tools to assess multiple evolving situations in real time. Third, most tools providing such capabilities are proprietary systems with little algorithmic or data transparency, and thus little buy-in from the global public health and research community. Here, we introduce Crowdbreaks, an open platform that allows tracking of health trends by making use of continuous crowdsourced labelling of public social media content. The system automates the typical workflow of data collection, filtering, labelling, and training of machine learning classifiers, and can therefore greatly accelerate the research process in the public health domain. This work introduces the technical aspects of the platform and explores its future use cases.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crowdbreaks: Tracking Health Trends Using Public Social Media Data and Crowdsourcing

Author: Marcel Salathé
Martin M. Müller
Publication venue: 'Frontiers Media SA'
Publication date: 01/04/2019

In the past decade, tracking health trends using social media data has shown great promise, due to a powerful combination of massive adoption of social media around the world and increasingly potent hardware and software that enable us to work with these new big data streams. At the same time, many challenging problems have been identified. First, there is often a mismatch between how rapidly online data can change and how rapidly algorithms are updated, which limits the reusability of algorithms trained on past data because their performance decreases over time. Second, much of the work focuses on specific issues during a specific past period, even though public health institutions need flexible tools to assess multiple evolving situations in real time. Third, most tools providing such capabilities are proprietary systems with little algorithmic or data transparency, and thus little buy-in from the global public health and research community. Here, we introduce Crowdbreaks, an open platform that allows tracking of health trends by making use of continuous crowdsourced labeling of public social media content. The system automates the typical workflow of data collection, filtering, labeling, and training of machine learning classifiers, and can therefore greatly accelerate the research process in the public health domain. This work describes the technical aspects of the platform, covering its current functionality and exploring its future use cases and extensions.

Directory of Open Access Journals

The evolution of complexity on the level of genes, individuals and populations

Author: Salathé Marcel
Publication venue: Ecole Polytechnique Fédérale de Zürich (ETHZ)
Publication date: 03/12/2015

Infoscience - École polytechnique fédérale de Lausanne

On the Ground Validation of Online Diagnosis with Twitter and Medical Records

Author: Bodnar Todd
Barclay Victoria C
Ram Nilam
Tucker Conrad S
Salathé Marcel
Publication date: 01/01/1995

Social media has been considered as a data source for tracking disease. However, most analyses are based on models that prioritize strong correlation with population-level disease rates over determining whether or not specific individual users are actually sick. Taking a different approach, we develop a novel system for social-media-based disease detection at the individual level using a sample of professionally diagnosed individuals. Specifically, we develop a system for making an accurate influenza diagnosis based on an individual's publicly available Twitter data. We find that about half (17/35 = 48.57%) of the users in our sample who were sick explicitly discuss their disease on Twitter. By developing a meta classifier that combines text analysis, anomaly detection, and social network analysis, we are able to diagnose an individual with greater than 99% accuracy even if she does not discuss her health. Comment: Presented at WWW2014. WWW'14 Companion, April 7-11, 2014, Seoul, Korea

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Publikationsserver der RWTH Aachen University

On the Ground Validation of Online Diagnosis with Twitter and Medical Records

Author: Barclay Victoria C
Bodnar Todd
Ram Nilam
Salathé Marcel
Tucker Conrad S
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

The Red Queen and the persistence of linkage-disequilibrium oscillations in finite and infinite populations

Author: Bonhoeffer Sebastian
Kouyos Roger D
Salathé Marcel
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The Red Queen Hypothesis (RQH) suggests that the coevolutionary dynamics of host-parasite systems can generate selection for increased host recombination. Since host-parasite interactions often have a strong genetic basis, recombination between different hosts can increase the fraction of novel and potentially resistant offspring genotypes. A prerequisite for this mechanism is that host-parasite interactions generate persistent oscillations of linkage disequilibria (LD). Results: We use deterministic and stochastic models to investigate the persistence of LD oscillations and its impact on the RQH. The standard models of the Red Queen dynamics exhibit persistent LD oscillations under most circumstances. Here, we show that altering the standard model from discrete to continuous time or from simultaneous to sequential updating results in damped LD oscillations. This suggests that LD oscillations are structurally not robust. We then show that in a stochastic regime, drift can counteract this dampening and maintain the oscillations. In addition, we show that the amplitude of the oscillations and therefore the strength of the resulting selection for or against recombination are inversely proportional to the size of the (host) population. Conclusion: We find that host-parasite interactions cannot generally maintain oscillations in the absence of drift. As a consequence, the RQH can strongly depend on population size and should therefore not be interpreted as a purely deterministic hypothesis.

Infoscience - École polytechnique fédérale de Lausanne

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central